Genomes Containing Duplicates Are Hard to Compare

Authors

Cédric Chauve

Guillaume Fertin

Romeo Rizzi

Stéphane Vialette

Abstract

In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G1 and G2. The first model, which we call the matching model, consists of establishing a one-to-one correspondence between genes of G1 and genes of G2 in such a way that M is optimized. The second model, called the exemplar model, consists of keeping in G1 (resp. G2) exactly one copy of each gene, thus deleting all the other copies, in such a way that M is optimized. We present several results on the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates, in both the exemplar and matching models. In the case of MAD and SAD, we actually prove that the problems are APX-hard under both models.
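To make the exemplar model concrete, here is a minimal Python sketch that brute-forces it for the number of common intervals. It is an illustration under stated assumptions, not the authors' construction: genomes are taken to be unsigned lists of gene names, a common interval is taken to be any gene set that is contiguous in both genomes (singletons and the whole genome included), and the function names `common_intervals`, `exemplarizations` and `best_exemplar_common_intervals` are hypothetical. The search enumerates every exemplarization, so its running time grows exponentially with the number of duplicated genes, which is in line with the hardness results stated above.

```python
from itertools import product


def common_intervals(p, q):
    """Count gene sets that are contiguous in both duplicate-free genomes.

    Assumed convention: singletons and the full gene set are counted.
    """
    pos = {gene: i for i, gene in enumerate(q)}
    count = 0
    for i in range(len(p)):
        lo = hi = pos[p[i]]
        for j in range(i, len(p)):
            lo = min(lo, pos[p[j]])
            hi = max(hi, pos[p[j]])
            # p[i..j] is contiguous in q iff its positions in q span j - i.
            if hi - lo == j - i:
                count += 1
    return count


def exemplarizations(genome):
    """Yield every way to keep exactly one copy of each gene, in order."""
    positions = {}
    for i, gene in enumerate(genome):
        positions.setdefault(gene, []).append(i)
    for choice in product(*positions.values()):
        yield [genome[i] for i in sorted(choice)]


def best_exemplar_common_intervals(g1, g2):
    """Maximize the number of common intervals over all exemplar pairs."""
    if set(g1) != set(g2):
        raise ValueError("both genomes must contain the same gene families")
    return max(
        common_intervals(a, b)
        for a in exemplarizations(g1)
        for b in exemplarizations(g2)
    )


if __name__ == "__main__":
    g1 = ["a", "b", "a", "c"]  # gene "a" is duplicated
    g2 = ["c", "a", "b"]
    print(best_exemplar_common_intervals(g1, g2))
```

In this toy instance, keeping the second copy of `a` in `g1` yields `b a c`, the reverse of `g2`, so every contiguous segment is a common interval; keeping the first copy yields `a b c`, which shares fewer intervals with `g2`.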


Similar resources

Genomes Containing Duplicates Are Hard to Compare (Extended Abstract)


On the Approximability of Comparing Genomes with Duplicates

A central problem in comparative genomics consists in computing a (dis-)similarity measure between two genomes, e.g. in order to construct a phylogenetic tree. A large number of such measures has been proposed in the recent past: number of reversals, number of breakpoints, number of common or conserved intervals, SAD etc. In their initial definitions, all these measures suppose that genomes con...


Expressed Structurally Stable Inverted Duplicates in Mammalian Genomes as Functional Noncoding Elements

Inverted duplicates are a type of repetitive DNA motif consisting of two copies of reverse complementary sequences separated by a spacer sequence. They can lead to genome instability and many may have no function, but some functional small RNAs are processed from hairpins transcribed from these elements. It is not clear whether the pervasive numbers of such elements in genomes, especially those o...


A Pseudo-Boolean Framework for Computing Rearrangement Distances between Genomes with Duplicates

Computing genomic distances between whole genomes is a fundamental problem in comparative genomics. Recent research has resulted in different genomic distance definitions, for example, number of breakpoints, number of common intervals, number of conserved intervals, and Maximum Adjacency Disruption number. Unfortunately, it turns out that, in the presence of duplications, most problems are NP-ha...


Dynamics of Gene Duplication in the Genomes of Chlorophyll d-Producing Cyanobacteria: Implications for the Ecological Niche

Gene duplication may be an important mechanism for the evolution of new functions and for the adaptive modulation of gene expression via dosage effects. Here, we analyzed the fate of gene duplicates for two strains of a novel group of cyanobacteria (genus Acaryochloris) that produces the far-red light absorbing chlorophyll d as its main photosynthetic pigment. The genomes of both strains contai...




Publication date: 2006
